METHOD FOR DETERMINING THE INFORMATION CAPACITY OF A PAPER
CHANNEL AND FOR DESIGNING OR SELECTING A SET OF BITMAPS 
REPRESENTATIVE OF SYMBOLS TO BE PRINTED ON SAID CHANNEL 

Background of the Invention 

The subject invention relates to a method for determining the information 
capacity of a printed symbol communications channel. More particularly, it relates to 
measurement of the information capacity of a paper channel including a symbol input 
defining symbols to be printed, a bitmap generator responsive to said symbol input to 
generate input bitmaps representative of corresponding input symbols, a printer 
responsive to said input bitmaps to produce printed symbols substantially determined 
by said bitmaps on a substrate, and an imager to capture images of said printed 
symbols from said substrate and generate corresponding image signals. 

As used herein the term "paper channel" refers to a communications channel, or
part of a complete communications channel, where information is input as a sequence
of symbols; the sequence is coded, typically by a bitmap generator, as a corresponding 
arrangement of symbols, which are printed on a substrate; and, the substrate is 
scanned to generate an image signal (hereinafter sometimes "image") as output of the 
paper channel. Typically, the image signal is then processed by a recognition system 
to determine the input symbol sequence. The input symbol sequence can also 
incorporate redundancies so that an error correction system can process the output of 
the character recognition system to recover more accurately the input symbol 
sequence.

The complete communication system, sometimes referred to herein as the
printed symbol communications channel, from input to recovery of the symbol thus 
involves the paper channel and the recognition system and possibly an error correction 
system. The information capacity of the complete channel is limited by the capacity of
the paper channel. However, the limitations of the paper channel can be masked by the
effects of the recognition system and error correction system. Thus, for example, when
evaluating a bar code printer it can be difficult to separate the effects of the recognition
and error correction systems from the print quality characteristics of the paper channel.
In particular, the effects of changes in the coding of the bar code generator (i.e., the
graphic design of the symbols) can be difficult to identify. System developers can be
misled by experiments performed with a paper channel and an imperfect recognition
system. For example, a change in the bar code generator coding that reduces the paper
channel information capacity can show improved overall reading. Thus, the problem of
optimizing the whole system is subverted by accepting a change which improves the
matching to a suboptimum recognition system.

The term "bitmap" as used herein refers to the ideal or nominal symbol design 
which is stored in and sent to the printer from the "bitmap generator". Actual storage of 
the "bitmap" can be in any convenient form such as an actual bitmap, line art, or simply 
a signal to print a particular symbol such as occurs with a line printer. Where an actual
bitmap is stored, resolution of the stored bitmap and the scanned image need not be
the same. Whatever form the actual storage of the symbol design takes, for purposes
of the following analysis it is assumed, without loss of generality, to be transformed into
a lattice of pixel values, i.e., an actual bitmap, having the same resolution as the
scanned image.

F-138 U.S. Express Mail EK731069815US

Thus, it is an object of the subject invention to provide a method for evaluating
the information capacity of a paper channel and the effects of design changes on that
capacity.

Brief Summary of the Invention 

The above object is achieved and the disadvantages of the prior art are
overcome in accordance with the subject invention by a method, and a paper channel
designed according to that method, for determining the information loss, or conditional
entropy, of a paper channel, the paper channel including: a symbol input defining symbols
to be printed, a bitmap generator responsive to the symbol input to generate bitmaps
representative of corresponding input symbols, a printer responsive to the bitmap
generator to print on a substrate symbol images substantially determined by the
bitmaps, and an imager to capture images from the substrate and generate
corresponding image signals; where the method includes the steps of: selecting a
general, parametric, statistical model for the paper channel; selecting a plurality of test
bitmaps; transmitting the test bitmaps through the paper channel to obtain a set of test
image signals for each of the symbols, each of the sets containing at least one test
image signal; adjusting parameters of the model so that image signals predicted by said
model for the set of test bitmaps substantially conform to the sets of test image signals,
so that a particular parameterization of said model substantially accurately describing
said paper channel is obtained; and determining an estimate for the information loss of
the channel in accordance with said particular parameterization.

In accordance with one aspect of the subject invention, the model is defined in
terms of a random variable S representative of a scanned image on a lattice
corresponding to a print field, and a second random variable B corresponding to a
bitmap input to said paper channel; and wherein said random variable S takes on
values s_j^i at points j in said lattice, where i labels an image selected from a set of
possible images, and wherein said random variable B takes on values b_j^c at points j in
said lattice, where c labels a symbol selected from a set of said symbols to be printed.

In accordance with another aspect of the subject invention, the estimate for the
information loss of the channel is determined by the further steps of: selecting one of
the symbols to be printed from at least a subset of the symbols to be printed, and, for a
predetermined number of iterations: computing a random value for an image signal in
accordance with a conditional probability distribution for the image signals assuming the
selected symbol, said conditional probability distribution being determined by the
particular parameterization; for the selected symbol, determining, in accordance with the
particular parameterization, a conditional probability of the selected symbol, assuming
the computed random output image value; over the predetermined number of iterations,
determining the mean conditional entropy, or information loss in transmitting the
selected symbol over said paper channel, as a function of the conditional probabilities
determined in sub-step f2); repeating these steps for all remaining ones of the subset of
symbols to be printed; and averaging the conditional entropies determined over all of
said test symbols to determine an approximate measure of the channel entropy, or
information loss in bits per printed symbol.

In accordance with another aspect of the subject invention, a design for, or
composition of, a component of a paper channel is selected, the paper channel
component being: a bitmap generator responsive to the symbol input to generate
bitmaps selected from a stored set of bitmaps and representative of corresponding
input symbols; a printer responsive to the bitmap generator to print on a substrate
symbol images substantially determined by the bitmaps; an imager to capture the
images from the substrate and generate corresponding image signals; the substrate; an
ink used by the printer; or the set of bitmaps; by the steps of: determining an average
information loss per symbol when a first design or composition is used for the
component; comparing the average information loss per symbol for the first design or
composition with a previously determined average information loss per symbol when a
previous design or composition is used for said component; and selecting whichever of
said designs or compositions has the lower average information loss per symbol.
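
By way of illustration only, the comparison step described above can be sketched in Python. The function name select_design and the design labels and loss values in the usage lines are hypothetical placeholders introduced for this sketch; they are not measurements and do not appear elsewhere in this description.

```python
def select_design(candidates):
    """Return the (label, loss) pair with the lowest average information loss per symbol.

    candidates: iterable of (label, average_loss_in_bits_per_symbol) pairs.
    """
    best = None
    for label, loss in candidates:
        # Keep whichever design or composition has the lower loss seen so far.
        if best is None or loss < best[1]:
            best = (label, loss)
    return best


# Placeholder values for illustration only, not measured results.
previous = ("previous design", 0.31)
first = ("first design", 0.27)
print(select_design([previous, first]))
```

The same comparison applies whether the component being varied is the bitmap set, the printer, the imager, the substrate, or the ink, since only the resulting average information loss per symbol enters the selection.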

Other objects and advantages of the subject invention will be apparent to those 
skilled in the art from consideration of the detailed description set forth below and the 
attached drawings. 

Brief Description of the Drawings 

Figure 1 shows a schematic block diagram of a generalized, conventional paper 
channel.

Figure 2 shows a group of test patterns used in the determination of parameter 
values for a statistical model of a paper channel in accordance with the subject 
invention. 

Figure 3 shows various representations of a symbol.

Figure 4 shows a flow diagram of the determination of parameter values for a
statistical model of a paper channel in accordance with the subject invention.

Figure 5 shows a flow diagram of the determination of estimated conditional
entropy (i.e., average information loss per symbol) for a paper channel.

Figure 6 shows a flow diagram of a method for selecting a set of bitmaps, or
nominal symbol designs, to reduce or minimize the conditional entropy of a paper
channel.

Detailed Description of Preferred Embodiments of the Invention 

Figure 1 shows conventional paper channel 10. Channel 10 includes: symbol 
input 12, bitmap generator 14, printer 16, substrate 20, and imager 22. 

Symbol input 12 can be any convenient source of input signals which specify
symbols to be printed, for example, a keyboard, a tape or disk drive, or the output of
another channel. The input signal can not only specify a particular symbol (e.g. the
letter "c") but also include font selection and formatting information which will modify the
printed symbol selected to represent the symbol (e.g.: c (lower case) - C (upper case) -
C (upper case, italic) - C (upper case, bold) - c (lower case, underlined)).

In one embodiment of the subject invention, paper channel 10 may comprise a
bar code printer. A bar code consists of an array of modules that have different optical
densities. For example, a commercially available bar code, which is marketed under
the trade name "DataMatrix," consists of a two-dimensional array composed of white
and black square modules. One way to describe a "DataMatrix" bar code is as an array
of symbols selected from a set consisting of two symbols: a black module and a white
module. The symbols described herein thus include bar codes and other printed
encoding schemes in addition to alphanumeric characters and the like.

Bitmap generator 14 codes the input signal to send output B representative of
selected nominal symbol designs (i.e. "bitmaps") to printer 16. (As noted above this
output can be transformed to a bitmap having the same resolution as imager 22 without
loss of generality.)

In a preferred embodiment, the subject invention employs a mapping of the
graphical design of the character or symbol to a bitmap with the same resolution as the
image. Each symbol has an ideal graphical design. (Depending upon the application,
the ideal design can be either a continuous graphic design, such as design 46, or an
input bitmap, such as is illustrated by pattern 47.) In a preferred embodiment, the ideal
graphical design is divided into a grid with the same resolution as the imager. If a given
element of the grid, corresponding to lattice site j, is more than half covered with the
high optical density part of the symbol design, then the field b_j is set to 1. Otherwise it
is set to 0. Alternatively, the field b_j is a continuous variable set to the percentage of the
grid rectangle that is covered with the high optical density part of the graphical design.
Those skilled in the art will recognize that many similar schemes can generate a bitmap
representative of the graphic design with resolution equal to the imager resolution.
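
By way of illustration only, the mapping of a graphical design onto the imager lattice can be sketched in Python using NumPy. The function name rasterize, the parameter cell (the number of design samples per lattice site along each axis), and the sample design are assumptions introduced for this sketch; the design is assumed to be supplied as a fine grid of 0/1 values whose sides are multiples of cell.

```python
import numpy as np


def rasterize(design, cell, binary=True):
    """Map a fine-grained design (1 = high optical density) onto the imager lattice."""
    design = np.asarray(design, dtype=float)
    rows, cols = design.shape
    if rows % cell or cols % cell:
        raise ValueError("design size must be a multiple of cell")
    # Fraction of each grid element covered by the dark part of the design.
    coverage = design.reshape(rows // cell, cell, cols // cell, cell).mean(axis=(1, 3))
    if binary:
        # b_j = 1 only when the element is more than half covered, otherwise 0.
        return (coverage > 0.5).astype(int)
    # Alternative: continuous-valued b_j equal to the covered fraction.
    return coverage


design = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
])
print(rasterize(design, cell=2))
print(rasterize(design, cell=2, binary=False))
```

A grid element that is exactly half covered is set to 0 in the binary case, following the "more than half" condition stated above.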

Printer 16 can be any convenient form of print engine such as an ink jet printer or 
a laser printer. Printer 16 responds to the output of bitmap generator 14 to print 
symbols which approximate, more or less closely, the nominal symbol design 
represented by the bitmap on substrate 20, which is typically one or more sheets of 
paper, though printing on any suitable surface, such as plastic sheets, is within the 
contemplation of the subject invention. 

Sheets 20 are then (possibly after substantial delay and/or transport over
substantial distance) scanned by imager 22 to produce an output S for further
processing as described above. Typically imager 22 is a raster scanner and output S is
a time sequence of signals corresponding to sites in a symbol lattice, hereinafter
sometimes "image pixels", but imager 22 can also be a "camera" which captures an
image of a symbol, or of a larger portion of the print field, with a matrix of detectors, in
which case output S is an array of parallel signals corresponding to lattice sites.

Output S is then input to recognition system 24 to form printed symbol 
communications channel 26 which produces an output of recovered symbols R 
approximating the symbol input from symbol input 12, as will be discussed further 
below. 

Generally, channel 10 is noiseless from symbol input 12 to input B to printer 16. 
That is, a particular input symbol specifies a particular, corresponding output bitmap b, 
and a particular output b determines a corresponding input symbol, possibly with 
corresponding format or font selection information. Thus, the information capacity of
channel 10 is limited by the characteristics of printer 16, ink 18, substrate 20 and imager
22.

In the discussion below, input B will be considered to be a lattice random
variable which takes on particular values b_c representative of the bitmap for symbol c.
The value of b_c at the jth lattice site, hereinafter sometimes bitmap pixel, is b_j. S will
be considered to be a lattice random variable which takes on particular values s_i
representative of the ith image selected from the set of all possible images. The value
of s_i at the jth lattice site, hereinafter sometimes image pixel, is s_j.

Channel 10 can also include test pattern generator 24 which inputs to printer 16
test pattern bitmaps designed to reveal local distortions introduced by printer 16, ink 18,
substrate 20, or imager 22. In other embodiments bitmap generator 14 can generate
test pattern bitmaps.

Figure 2 shows a number of possible test patterns illustrative of test bitmaps tb_c
which can be used to develop a statistical model describing paper channel 10. Test
pattern 30 is an "all white" pattern corresponding to a null test bitmap where no pixel is
asserted. Test pattern 32 is an "all black" pattern corresponding to a test bitmap where
every pixel is asserted. Patterns 34 and 38 are parallel sets of interleaved, relatively
thin bars and half bars, with vertical and horizontal orientation respectively. Pattern 40
is an arrangement of relatively thick half bars in vertical and horizontal orientations.
Patterns 30 through 40 are typical of test patterns that have been developed by those
skilled in the art to clearly show typical local print distortions. These or other similar test
patterns which are known to, or can easily be designed by, those skilled in the art are
input to printer 16, and the resultant image signals s are analyzed to estimate
parameters for a statistical model describing channel 10 in a manner which will be more
fully described below. Similar test patterns are often used by those skilled in the art of
evaluating print quality. Typical print quality parameters are modulation, graininess,
print growth, edge roughness and waviness.

Patterns 42 and 44 show test patterns used in another embodiment of the subject
invention. Patterns 42 and 44 are arrangements of a subset, which can be the full set,
of test symbols generated from corresponding test bitmaps tb_c and selected from a set
of symbols to be printed. Since it has been found that print symbol distortions are local
(i.e. the probability p(s_i | b_c) of a particular image signal being generated by a symbol
printed in response to bitmap b_c, at a particular location in the print field (hereinafter
sometimes "page"), is substantially independent of other symbols printed on the page),
test symbols which are repeated in test patterns such as patterns 42 and 44 can be
considered as repeated instances of the same test symbol.

The selected test patterns should reflect the typical features of the symbols
employed in the channel.

Figure 3 illustrates various representations of a typical symbol. Design 46 is an
ideal graphical symbol design such as is produced by a typographic designer. Pattern
47 illustrates an input bitmap such as would be stored in bitmap generator 14. Pattern 48
illustrates an image signal captured by imager 22. Note that pattern 48 does not
necessarily have the same resolution as pattern 47 and differs from pattern 47 by the
random addition of pixels 49 and dropping of pixels 50. The probability distribution of
images produced by a given input bitmap (e.g. pattern 48 produced by pattern 47) can
be described by a local statistical paper channel model as described below. The paper
channel model provides a mechanism for deriving the relationship between the paper
channel information capacity and print quality parameters.

This relationship is independent of the recognition process. The ability to
separate limitations of the paper channel process from limitations of the
recognition process allows sequential rather than simultaneous optimization of the two.
When a recognition system is tuned to interpret captured images from a first
paper channel of a first application, and is then employed to interpret captured images
from a second paper channel of a second application, it usually underperforms. Fault
may be found with the second paper channel, when in reality the fault is with an
unmatched recognition system. For example, if the recognition system employs a fixed
binarization threshold, and the new substrate has a background optical density that is
too close to the threshold, then the system will fail to perform well, even though the new
paper channel, excluding the recognition process, may have sufficient information
capacity.

The information capacity I(B,S) of paper channel 10 is:

\[ I(B,S) = H(B) - H(B \mid S), \]

where H(B) is the entropy, or information capacity, of input B to printer 16 (i.e. the
amount of information which can be conveyed by selection among various particular
values b_c of B), and H(B|S) is the conditional entropy of B assuming S. I(B,S) can be
considered as the average amount that uncertainty about particular values of input B is
reduced by knowing the values of output S produced. Thus, H(B|S) is the information
loss of channel 10, i.e., the amount by which channel capacity I(B,S) is less than H(B),
the information capacity of input B.

The information capacity of B is:

\[ H(B) = -\sum_{c} p(b_c) \log_2 p(b_c), \]

where \sum_c represents summation over all bitmaps b_c, and p(b_c) is the probability
of b_c. (Note: hereinafter all logs are base 2 unless otherwise stated.) Assuming that the
distribution of B is uniform, that is p(b_c) = 1/N_c where N_c is the number of symbols, then

\[ H(B) = -\sum_{c} \frac{1}{N_c} \log \frac{1}{N_c} = \log N_c . \]

The information capacity of channel 10 is thus determined by the conditional
entropy, or information loss H(B|S). A uniform distribution with each character having
probability 1/N_c maximizes the information per character in the message. The
information capacity usually differs from symbol to symbol. A non-uniform distribution
of the probability for a symbol, favoring symbols with higher channel information
capacity, maximizes the channel capacity. Other non-uniform distributions are
determined by the message space and the encoding scheme of the particular
application. While a uniform distribution is not necessary, it is a reasonable assumption
in the absence of information about the statistics of the source, i.e., the distribution of B,
and will be made for the following analysis unless otherwise stated.
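
By way of illustration only, the entropy of the source can be evaluated numerically. The following Python sketch computes H(B) in bits for a given set of symbol probabilities p(b_c); the function name source_entropy and the example distributions are assumptions introduced for this sketch.

```python
import math


def source_entropy(probs):
    """H(B) in bits for a list of symbol probabilities p(b_c)."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


n_c = 64
uniform = [1.0 / n_c] * n_c
print(source_entropy(uniform), math.log2(n_c))   # the two values agree for a uniform source

skewed = [0.5, 0.25, 0.125, 0.125]
print(source_entropy(skewed), math.log2(len(skewed)))   # a non-uniform source carries less
```

For the uniform case the result equals log N_c, consistent with the expression given above; any non-uniform distribution over the same N_c symbols gives a smaller H(B).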

To evaluate channel 10, it is first necessary to characterize channel 10 by
developing a model which describes its operation (more particularly, which describes
the operation from input of signal B through output of signal S, since selection of values
b_c is assumed to be noiseless). Figure 4 shows the development of such a model in
accordance with the subject invention.

At step 52, a general, parametric, statistical model suitable to describe the
operation of channel 10 is selected. The statistical relationship between the input bitmap
and the image is local in two dimensions. The probability distribution for optical density
of a given printed symbol pixel is related only to neighboring bitmap pixels b_j and there
is no long-range interaction between different parts of the image. Thus, the joint
probability distribution for two sufficiently separated image pixels s_j is the product of
their individual distributions. One bitmap pixel b_j modifies the probability distribution of
several nearby image pixels s_j, and the probability distribution for each image pixel is
modified by the values of several bitmap pixels. These modifications result, in
combination with properties of paper 20, printer 16 and imager 22, in image quality
characteristics, such as print growth, background noise, modulation, contrast, and blur.
These image quality characteristics are common descriptors of print quality and image
quality. A parameterized statistical model can describe the probabilistic relationship
between the input bitmap B and output S. The model parameters are determined by
matching the statistical quality characteristics of a set of test images, as will be
described further below.

In a preferred embodiment of the subject invention, a model analogous to the
energy function of a generalized two-dimensional Ising model is an appropriate choice.
The Ising model used in the preferred embodiment described below produces a
statistically distributed, locally interacting, binary random variable, or spin, on each site
of a two-dimensional lattice in the presence of a field on the lattice, to model thresholded,
binary image pixels s_j produced by bitmap B. As noted above, before applying the
model, a coordinate transformation is applied to bitmap B to line it up with the image.
An Ising model can have several parameters. The preferred model presented here has
four parameters. Transformed bitmap B is converted to a position-dependent applied
field that has a value b_ink if the corresponding image pixel should be dark and b_paper if
the corresponding image pixel should be light. The tendency for neighboring pixels to
assume the same value is described by a nearest neighbor coupling factor J. Each
image pixel s_j assumes a value +1 for a dark image pixel and -1 for a light image pixel.

In the preferred model a function, analogous to the statistical mechanical energy
for Ising model spins, is of the form:

\[ E[s \mid b] = -\sum_{j} s_j \Big( L\, b_j + J \sum_{k \in nn(j)} s_k \Big), \]

where the second sum is over nearest neighbors of the point j and J is a nearest
neighbor coupling factor and L is a coupling factor between bitmap b and image s which
will be described further below. The conditional probability for a given output s_i, given
an applied field b_c, is:

\[ p[s_i \mid b_c] = \frac{\exp(-E[s_i \mid b_c])}{\sum_{s} \exp(-E[s \mid b_c])}, \]

where \sum_s represents summation over all particular values s for output S.
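
By way of illustration only, the energy function and conditional probability can be evaluated directly on a very small lattice. The following Python sketch assumes an energy of the form E = -sum_j s_j (field_j + J * sum of the nearest neighbors of j), with the coupling L folded into the applied field and free (non-periodic) boundaries; that form, the function names, and the parameter values are assumptions introduced for this sketch. The normalizing sum is taken over all 2^n images by brute force, which is practical only for a handful of pixels.

```python
import itertools
import math

import numpy as np


def neighbor_sum(s):
    """Sum of the up, down, left and right neighbors of every site (zero outside the lattice)."""
    p = np.pad(s, 1)
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]


def energy(s, field, J):
    """Assumed Ising-like energy of image s (+1 dark, -1 light) under an applied field."""
    s = np.asarray(s, dtype=float)
    return float(-(s * (field + J * neighbor_sum(s))).sum())


def conditional_probability(s, field, J):
    """p(s | b) = exp(-E[s | b]) divided by the sum of exp(-E[s' | b]) over all images s'."""
    shape = np.shape(field)
    n = int(np.prod(shape))
    z = 0.0
    for bits in itertools.product((-1, 1), repeat=n):   # 2**n images: tiny lattices only
        z += math.exp(-energy(np.reshape(bits, shape), field, J))
    return math.exp(-energy(s, field, J)) / z


bitmap = np.array([[1, 0], [1, 0]])            # 1 = pixel should be dark
b_ink, b_paper, J = 1.5, -1.5, 0.4             # illustrative values only
field = np.where(bitmap == 1, b_ink, b_paper)  # position-dependent applied field
ideal = np.where(bitmap == 1, 1, -1)           # image that reproduces the bitmap exactly
print(conditional_probability(ideal, field, J))
```

For lattices of realistic size the normalizing sum cannot be enumerated, which is why sampling methods are used in practice.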

In other embodiments of the subject invention, models retain the form described
while the values of b_j may be binary (to model monotone images), an integer (to
model discrete gray-scale), a continuous variable (to model gray-scale), or a vector
value (to model discrete or continuous color). Similarly the value of s_j may
be binary, an integer, a continuous variable, or a vector value.

In general, more complicated models having more or more complex parameters
are required to model print channel characteristics more accurately; however, the
preferred model accounts for important typical image characteristics without undue
computational complexity. It is believed that such models can readily be developed by
those skilled in the art from the descriptions set forth herein and use of other models is
within the contemplation of the subject invention. These models will look very similar to
the above binary model, with additional terms proportional to powers of the s_j on one
site and terms proportional to products of powers of s_j on neighboring sites, and for
vector-valued s_j, terms proportional to products of powers of the components of s on one
or neighboring sites. These terms can similarly be calculated by matching print quality
characteristics. In a paper channel with reasonably good print quality, there is no long
range correlation introduced into the image; the image in one area is independent of the
image a few pixels away. Therefore, it will not be necessary to include a large number
of terms to obtain a good model of the paper channel.

Typically, appropriate forms of models will have been previously determined, and
a person skilled in the art will select the general form of model from knowledge of the
type of channel to be evaluated.

Then, at 54, the next test bitmap tb_c is selected and, at 56, sent over channel 10
to produce a test output ts_i. At 60, it is determined if the selected test bitmap is to be
sent again to generate another particular output image ts_i. If so, the process returns to
56; otherwise, at 62, it is determined if this is the last test bitmap. If not, the process
returns to 54.

Otherwise, at 64, the model parameters are adjusted to reflect the test output
images ts_i. The preferred model reproduces many of the image quality characteristics
of a typical printer-camera system, such as print growth, background noise, modulation,
contrast, and blur. These image quality characteristics are common descriptors of print
quality and image quality. The distribution of these or other print quality characteristics
can be determined by examination of the test output images ts_i. The three parameters
J, b_ink and b_paper can be determined by adjustment to match print quality characteristics
of test outputs ts_i. A large positive value of b_ink produces a uniform dark image. A large
negative value of b_paper produces a clean background. A large value of J produces
strong correlation between neighboring sites, so small white or black islands or details
in the bitmap tend to disappear in the image. An asymmetry between b_paper and b_ink
combined with a comparable value of J results in print growth or print shrinkage. By
considering these properties of the model and considering the print quality
characteristics of test outputs ts_i, a person skilled in the art can approximate model
parameter values which will describe channel 10.

A preferred method is to calculate correlations within images s_i and between
images s_i and bitmaps b_c. A good model will reproduce the correlations found
experimentally, and deviations can be used to correct the model parameters.
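
By way of illustration only, two simple correlations of this kind can be computed as follows. The Python sketch assumes that both the image and the bitmap are expressed as arrays of +1 (dark) and -1 (light) on the same lattice; the function names and the particular statistics chosen (mean nearest-neighbor product and mean same-site product) are assumptions introduced for this sketch, and other correlation measures could equally be used.

```python
import numpy as np


def nearest_neighbor_correlation(image):
    """Mean product of horizontally and vertically adjacent pixels (values +1/-1)."""
    s = np.asarray(image, dtype=float)
    horizontal = s[:, :-1] * s[:, 1:]
    vertical = s[:-1, :] * s[1:, :]
    return float((horizontal.sum() + vertical.sum()) / (horizontal.size + vertical.size))


def image_bitmap_correlation(image, bitmap):
    """Mean product of each image pixel with the bitmap pixel at the same site."""
    return float(np.mean(np.asarray(image, dtype=float) * np.asarray(bitmap, dtype=float)))


bitmap = np.array([[1, 1, -1], [1, 1, -1], [-1, -1, -1]])
scan = np.array([[1, 1, -1], [1, -1, -1], [-1, -1, -1]])   # same pattern with one dropped pixel
print(nearest_neighbor_correlation(scan))
print(image_bitmap_correlation(scan, bitmap))
```

Comparing such statistics for images generated by the model with those measured on the test outputs indicates the direction in which a parameter such as J or the applied field should be adjusted.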

It should be noted that test bitmaps tb_c and test output images ts_i are
conceptually identical to bitmaps b_c and output images s_i, except that in some cases,
they cover a larger portion of the page. In other cases, such as test patterns 42 and 44,
a group of test bitmaps each covering a part of the page is sent through channel 10 as
a single pattern. Because the dependence of the output image s_i on bitmap b_c is local,
test bitmaps tb_c such as are illustrated in test patterns 42 and 44 can be used to
estimate model parameters which are also local in two dimensions.

A particular example where the model parameters vary with position in the image 
is the common case where the lighting is not uniform. Non-uniform lighting produces 
reduced contrast in areas that are under or over illuminated. Over illumination 
produces blooming and narrowing of dark areas. Under illumination produces growth of 
dark areas. This effect can be modeled by multiplying the bitmap field b_j by an
illumination field L_j drawn from a random variable L representing the variation in
illumination. Generally, L will exhibit long wavelength variation. If L is distributed
according to the variation in illumination, then the model will exhibit local characteristics 
comparable to the noted growth and shrinkage of dark areas. 

Another type of local random variation is preprinting or texture on the substrate. 
Those skilled in the art will recognize that other similar types of position dependent 
variability can be included in the model. 

At 68 the model is then used to generate random images assuming test bitmaps
tb_c, preferably using the Metropolis Monte Carlo algorithm. These computed images are
then compared with the test output images ts_i and if they are consistent the process
ends. Otherwise it returns to 64 to further adjust the model parameters. The
comparison is consistent if the distribution of print quality characteristics is substantially
similar for the computed images and the actual test output images ts_i. (The Metropolis
Monte Carlo algorithm is a known algorithm for generation of random results for a given
statistical model and need not be discussed further here for an understanding of the
subject invention.)
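
By way of illustration only, a single-site Metropolis sampler for an Ising-like model can be sketched in Python. The sketch assumes an energy of the form E = -sum_j s_j (field_j + J * sum of the nearest neighbors of j) with free boundaries; that form, the function name metropolis_image, the number of sweeps, and the parameter values are assumptions introduced for this sketch and are not fitted to any actual channel.

```python
import numpy as np


def metropolis_image(field, J, sweeps=200, rng=None):
    """Draw one +1/-1 image with probability proportional to exp(-E[s | b])."""
    rng = np.random.default_rng() if rng is None else rng
    field = np.asarray(field, dtype=float)
    rows, cols = field.shape
    s = rng.choice([-1, 1], size=field.shape)        # random starting image
    for _ in range(sweeps * rows * cols):
        r, c = rng.integers(rows), rng.integers(cols)
        nn = 0                                       # sum of nearest neighbors of (r, c)
        if r > 0:
            nn += s[r - 1, c]
        if r < rows - 1:
            nn += s[r + 1, c]
        if c > 0:
            nn += s[r, c - 1]
        if c < cols - 1:
            nn += s[r, c + 1]
        # Energy change if s[r, c] is flipped; each neighbor pair appears twice in E.
        delta = 2.0 * s[r, c] * (field[r, c] + 2.0 * J * nn)
        # Metropolis rule: always accept a decrease, otherwise accept with probability exp(-delta).
        if delta <= 0 or rng.random() < np.exp(-delta):
            s[r, c] = -s[r, c]
    return s


bitmap = np.zeros((8, 8), dtype=int)
bitmap[2:6, 2:6] = 1                                 # a dark square on a light background
field = np.where(bitmap == 1, 3.0, -3.0)             # illustrative b_ink and b_paper
image = metropolis_image(field, J=0.2, sweeps=50, rng=np.random.default_rng(0))
print(np.mean(image == np.where(bitmap == 1, 1, -1)))  # fraction of pixels matching the bitmap
```

Repeating the call with independent random generators yields a set of computed images whose print quality characteristics can be compared with those of the test output images.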
Once satisfactory model parameters have been determined, the information loss
and capacity for channel 10 are given by:

\[ I(B,S) = H(B) - H(B \mid S) \]

\[ H(B) = -\sum_{c} p(b_c) \log p(b_c); \]

and assuming p(b_c) = 1/N_c, H(B) = log N_c, where N_c is the number of symbols.

The information loss, H(B|S), is the average over all output values s_i of the
conditional entropy of B given that S = s_i, given by:

\[ H(B \mid S) = -\sum_{i} p(s_i) \sum_{c} p(b_c \mid s_i) \log p(b_c \mid s_i) \]

To evaluate H(B|S), p(s_i) is given by:

\[ p(s_i) = \sum_{c} p(s_i, b_c) = \sum_{c} p(s_i \mid b_c)\, p(b_c); \]

and p(b_c | s_i) is given by:

\[ (1) \quad p(b_c \mid s_i) = \frac{p(s_i \mid b_c)\, p(b_c)}{\sum_{c'} p(s_i \mid b_{c'})\, p(b_{c'})}; \]

and, assuming p(b_c) = 1/N_c,

\[ (2) \quad p(b_c \mid s_i) = \frac{p(s_i \mid b_c)}{\sum_{c'} p(s_i \mid b_{c'})} \]

Thus, it will be apparent that H(B|S) and, thus, I(B,S) can be derived from the
model of Figure 4 using (2) where p(b_c) is assumed constant and (1) where p(b_c) is not
constant. However, depending on the statistical mechanical model for s_i, analytic
calculation of p(s_i) may be difficult. Every possible image has at least some very small
probability of arising from any symbol, so the sums over i can have many terms. An
alternative statistical approach is to estimate the information loss based on the
statistical model derived from image quality parameters as shown in Figure 5.
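
By way of illustration only, when the set of possible images is small enough to enumerate, the information loss can be computed exactly from a table of conditional probabilities. The following Python sketch assumes a table likelihood[c, i] = p(s_i | b_c); the function name information_loss and the two-symbol, two-image example are assumptions introduced for this sketch and do not describe any actual channel.

```python
import numpy as np


def information_loss(likelihood, prior=None):
    """H(B|S) in bits from likelihood[c, i] = p(s_i | b_c) and prior[c] = p(b_c)."""
    likelihood = np.asarray(likelihood, dtype=float)
    n_c = likelihood.shape[0]
    # A uniform prior p(b_c) = 1/N_c is assumed when none is supplied.
    prior = np.full(n_c, 1.0 / n_c) if prior is None else np.asarray(prior, dtype=float)
    joint = likelihood * prior[:, None]              # p(s_i, b_c)
    p_s = joint.sum(axis=0)                          # p(s_i), summed over symbols
    mask = joint > 0                                 # zero-probability terms contribute nothing
    posterior = joint[mask] / np.broadcast_to(p_s, joint.shape)[mask]   # p(b_c | s_i)
    return float(-(joint[mask] * np.log2(posterior)).sum())


# Toy channel: each of two symbols is imaged correctly 90% of the time (illustrative only).
likelihood = np.array([[0.9, 0.1],
                       [0.1, 0.9]])
loss = information_loss(likelihood)
capacity = np.log2(likelihood.shape[0]) - loss       # I(B,S) = H(B) - H(B|S), uniform source
print(loss, capacity)
```

The double sum over images and symbols is written here as a single sum over the joint probability p(s_i, b_c), which is equivalent to averaging the conditional entropy over p(s_i).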

At 80, the next symbol from the set of symbols to be printed is selected. Note 
that, in general, not all possible symbols that may be printed in the future are known. 
However, those skilled in the art will be able to select a set of N_c symbols which will be 
sufficient to evaluate channel 10, at least for particular applications of interest. These 
N_c symbols are an integral part of the channel under evaluation. If a second set of 
symbols is employed, then a second channel is created and must be evaluated. The 
same model parameters can be employed if the printer, substrate, ink and imager are 
the same, and the new font characteristics are consistent with the test patterns used in 
the first channel. Similarly, if a barcode bitmap generator is modified, the 
corresponding paper channel must be re-evaluated. Examples of such bitmap 
modification include changing the size of the modules, changing the relative size of 
black and white modules, or changing the print density by varying spot size or density. 

At 82, a random output image s_i corresponding to bitmap b_c for the selected 
symbol is computed and saved, preferably using the Metropolis Monte Carlo algorithm. 
At 84, the conditional entropy for the selected symbol is computed as: 

$$ H(b_c|s_i) = -p(b_c|s_i) \log(p(b_c|s_i)) $$

where p(b_c|s_i) is given by: 

$$ p(b_c|s_i) = \frac{p(s_i|b_c)}{\sum_{c'} p(s_i|b_{c'})} $$

as above. 

If another random output is to be computed then, at 86, the process returns to 82. 
At 90, the average of the conditional entropies for the selected symbol is computed and 
saved. Preferably, about 100 computed values of outputs s_i will be used to obtain a 
sufficiently accurate measure of information loss for the selected symbol. The number 
of samples of s depends on the desired accuracy of the measurement, but even a few 
samples are sufficient to get an estimate of the capacity. 

If, at 92, another symbol is to be evaluated, the process returns to 80. 
Otherwise, at 94 the average information loss per printed symbol is computed and the 
process ends. 
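
The loop of Figure 5 can be sketched as follows, under stated assumptions. Independent Gaussian pixel noise of standard deviation `sigma` stands in for the fitted statistical model and the Metropolis sampler, because its likelihood p(s_i|b_c) is simple to evaluate. The per-sample term is taken to be the full entropy of the posterior p(b_c|s_i) over all candidate symbols, following the definition of H(B|S) as the average over outputs of the conditional entropy of B. The function names, the noise model, and the use of bits are hypothetical choices made for the illustration.

```python
import math
import random


def log_likelihood(image, bitmap, sigma):
    """log p(s | b), up to a constant shared by all bitmaps (Gaussian noise)."""
    return -sum((s - b) ** 2 for s, b in zip(image, bitmap)) / (2 * sigma ** 2)


def posterior(image, bitmaps, sigma):
    """p(b_c | s_i) for every c under a uniform prior, as in equation (2)."""
    logs = [log_likelihood(image, b, sigma) for b in bitmaps]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


def estimate_loss(bitmaps, sigma, samples=100, seed=0):
    """Estimate the average information loss per printed symbol, in bits."""
    rng = random.Random(seed)
    per_symbol = []
    for bitmap in bitmaps:  # step 80: select the next symbol
        entropies = []
        for _ in range(samples):  # steps 82 to 86: random outputs
            image = [b + rng.gauss(0.0, sigma) for b in bitmap]
            post = posterior(image, bitmaps, sigma)
            # Step 84: conditional entropy of B given this output.
            entropies.append(-sum(p * math.log2(p) for p in post if p > 0))
        per_symbol.append(sum(entropies) / samples)  # step 90
    return sum(per_symbol) / len(per_symbol)  # step 94
```

With four flattened bitmaps such as `[(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]`, the estimate approaches zero for small `sigma` and approaches log2(4) = 2 bits as the noise overwhelms the differences between the bitmaps.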

If another, non-uniform distribution for B is assumed, then equation (1) can be 
used at 84 to estimate the conditional entropy, with: 

$$ p(b_c|s_i) = \frac{p(s_i|b_c) \, p(b_c)}{\sum_{c'} p(s_i|b_{c'}) \, p(b_{c'})} $$

using the assumed values for p(b_c), and the average computed at 94 is weighted in 
accordance with the assumed distribution of B. 
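
As a purely illustrative worked example with assumed numbers (not taken from any measurement), consider two symbols with p(b_1) = 0.9 and p(b_2) = 0.1, and an output s_i for which p(s_i|b_1) = 0.2 and p(s_i|b_2) = 0.6. Equation (1) gives:

$$ p(b_1|s_i) = \frac{0.2 \times 0.9}{0.2 \times 0.9 + 0.6 \times 0.1} = \frac{0.18}{0.24} = 0.75 $$

whereas equation (2), which assumes p(b_1) = p(b_2) = 1/2, would give:

$$ p(b_1|s_i) = \frac{0.2}{0.2 + 0.6} = 0.25 $$

The assumed distribution of B can therefore change which symbol is the more probable source of a given output.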

Either of the methods described above, analytic or statistical, provides a 
measure or estimate for the information loss per symbol and information capacity in 
generalized paper channel 10 which is independent of the effects of any recognition 
algorithm or error correction code which is used. These values provide a valuable 
figure of merit which can be used, for example, to evaluate bar code printers to be used 
in a communications channel which includes a paper channel without need to separate 
out the effects of recognition algorithms and/or error correction codes. 

The method of the subject invention can also be used to modify the design or 
select particular components for channel 10 (i.e., particular choices for printer 16, ink 
18, substrate 20 and imager 22, or for the set of input bitmaps B). Figure 6 shows an 
application of the subject method to the design of input bitmaps B which are optimized for 
the particular physical components of channel 10; i.e., printer 16, ink 18, substrate 20 and 
imager 22. 

At 100, a first set of "previous" bitmaps B_P = 0 having null information content 
(e.g., for all c, b_c = 0) is selected, so that I(B_P,S) = H(B_P) - H(B_P|S) = 0 and the 
information loss H(B_P|S) is maximum. 

At 102, a next set of bitmaps B_N to be evaluated is selected. B_N can be selected from 
an existing group of bitmaps or can be generated by incremental changes to B_P. Such 
changes can be either small random changes or can be guided by the knowledge and 
experience of a person skilled in the typographic arts. 

At 104, the information loss H(B_N|S) for the combination of input bitmaps B_N and the 
physical channel under consideration is determined in a manner described 
above. (It should be noted that when evaluating sets of bitmaps B in this manner, a 
representative subset, which can be the full set, of bitmaps b_c can comprise the test 
bitmaps. If possible, it is better to use a complete set. Obviously, for a bar code such 
as commercially available barcodes marketed under the trademarks "PDF417" or 
"DataMatrix", it is not possible to test all bar codes, and a representative subset must be 
taken, for example, a small set of bar code modules (say 2 by 2) with a representative 
surrounding set of "guard" modules.) 

At 106, it is determined if H(B_N|S) < H(B_P|S); and if not, at 110, it is determined if 
the evaluation of input bitmaps is done. The process can be considered done if 
pre-selected criteria are met. For example, all of an existing group of bitmaps to be 
considered have been evaluated, a predetermined number of input bitmaps B_N have 
been evaluated, a predetermined level of information loss has been reached, or further 
incremental changes do not produce improvement (i.e., a local optimum has been 
reached). If the process is not done, it returns to 102. If H(B_N|S) < H(B_P|S), then, at 
112, B_P is set equal to B_N and the process returns to 110. When the process is done, input 
bitmaps B_P will be optimal for use with the physical channel under consideration in 
accordance with the pre-selected criteria. 
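
By way of non-limiting illustration, the loop of Figure 6 can be sketched as follows. The sketch assumes a toy channel in which each pixel is flipped independently with probability `flip`, so that H(B|S) can be computed exactly by enumerating every output image. The function names `loss` and `optimize`, the bitmap dimensions, the single-pixel random change used at 102, and the fixed step budget used as the stopping criterion at 110 are all hypothetical choices.

```python
import itertools
import math
import random


def loss(bitmaps, flip=0.1):
    """Exact H(B|S) in bits for a uniform prior and independent pixel flips."""
    n_bits = len(bitmaps[0])
    total = 0.0
    for image in itertools.product((0, 1), repeat=n_bits):
        # p(s_i | b_c) for every candidate bitmap b_c.
        like = []
        for b in bitmaps:
            d = sum(x != y for x, y in zip(image, b))
            like.append(flip ** d * (1 - flip) ** (n_bits - d))
        norm = sum(like)
        for lk in like:
            # Joint probability p(s_i, b_c) times -log p(b_c | s_i).
            total -= (lk / len(bitmaps)) * math.log2(lk / norm)
    return total


def optimize(n_symbols=4, n_bits=6, steps=300, seed=0):
    """Greedy search for a bitmap set with low information loss."""
    rng = random.Random(seed)
    best = [[0] * n_bits for _ in range(n_symbols)]  # step 100: B_P = 0
    best_loss = loss(best)
    for _ in range(steps):  # step 110: stop after a fixed budget
        trial = [row[:] for row in best]  # step 102: B_N from B_P
        c, j = rng.randrange(n_symbols), rng.randrange(n_bits)
        trial[c][j] ^= 1  # small random change to one pixel
        trial_loss = loss(trial)  # step 104
        if trial_loss < best_loss:  # step 106
            best, best_loss = trial, trial_loss  # step 112
    return best, best_loss
```

In this sketch the all-zero starting set gives the maximum loss of log2(4) = 2 bits, and each accepted change makes the symbols easier to tell apart. As noted above, a search of this kind may stop at a local optimum.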

Those skilled in the art will recognize that the method of the subject invention 
can be used in a substantially similar process to optimize the selection or modification 
of the physical components of channel 10. The method of the subject invention can 
also be used to identify the contribution of a recognition system to the information loss 
in a printed symbol communications channel. 

Returning to Figure 1, if r_i is a recovered symbol from output R of recognition 
system 24, then a person skilled in the art can easily determine p(b_c|r_i) and thus H(B|R), 
the information loss in complete printed symbol communications channel 26, from 
the error statistics for printed symbol communications channel 26. Subtracting H(B|S), 
the information loss in paper channel 10, gives H(S|R), the information loss in 
recognition system 24. This knowledge can be used to avoid problems such as the 
inadvertent degradation of paper channel 10 to match a suboptimal recognition system 
24, discussed above. 
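
The subtraction described above can be sketched as follows, assuming the error statistics are available as a table of counts in which `counts[c][k]` is the number of times symbol b_c was printed and symbol r_k was recovered. The function names, the table layout, the plug-in estimate of probabilities from raw counts, and the use of bits are assumptions made for the illustration.

```python
import math


def conditional_entropy_from_counts(counts):
    """Estimate H(B|R) in bits from counts[c][k] (b_c printed, r_k recovered)."""
    total = sum(sum(row) for row in counts)
    n_r = len(counts[0])
    h = 0.0
    for k in range(n_r):
        column = [row[k] for row in counts]
        n_k = sum(column)  # number of times r_k was recovered
        for n_ck in column:
            if n_ck > 0:
                # p(b_c, r_k) times -log p(b_c | r_k), both estimated from counts.
                h -= (n_ck / total) * math.log2(n_ck / n_k)
    return h


def recognition_loss(counts, paper_channel_loss):
    """Loss attributed to the recognition system: H(B|R) minus H(B|S)."""
    return conditional_entropy_from_counts(counts) - paper_channel_loss
```

For example, `recognition_loss([[45, 5], [5, 45]], 0.2)` would attribute to the recognition system whatever part of the measured loss exceeds an assumed paper channel loss of 0.2 bits.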

While the Ising model is simple, sums over the values of spins on a large lattice 
are difficult. An alternative "Gaussian" model allows analytic calculation. In this model 



where the sum over nn is the sum over neighbors of site j. The expression for the 
probability density is 
$$ p(s) = \frac{\exp(-E(s|b_c))}{\int \exp(-E(s|b_c)) \, ds} $$

The integrals over the values of s can be calculated analytically using a Fourier 
representation. As described above, the model parameters are determined by 
matching print quality parameters of captured experimental images to those produced 
by the model. 

The embodiments described above and illustrated in the attached drawings have 
been given by way of example and illustration only. From the teachings of the present 
application, those skilled in the art will readily recognize numerous other embodiments 
in accordance with the subject invention. Particularly, other modifications of various 
indicia printed with different geometries will be apparent. Accordingly, limitations on the 
subject invention are to be found only in the claims set forth below. 